
lastgenre: Blacklist wrong last.fm genres #5744

Open
JOJ0 wants to merge 3 commits into lastgenre_split_monolith from lastgenre_forbidden

Conversation

JOJ0 (Member) commented Apr 20, 2025

Description

Adds a global and artist-specific genre blacklist to lastgenre. Blacklist entries can use regex patterns or literal genre names and are configurable per artist or globally (*). Example:

fracture:
    ^(heavy|black|power|death)?\s?(metal|rock)$
*:
    electronic

This avoids incorrect genre assignments for artists who share a name (e.g., “Fracture” the DnB producer vs. a metal band of the same name).

  • Regex patterns and plain genre names can be mixed: each entry is first compiled as a regex, and if compilation fails, it is escaped and added to the blacklist as a literal string.
  • Blacklist filtering occurs at two stages: immediately after Last.fm fetching to prevent blacklisted tags from entering the pipeline, and again during genre resolution to filter blacklisted genres from existing file tags.
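
The compile-then-fall-back behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `compile_entry`, `is_blacklisted`, and `drop_blacklisted` are hypothetical, and both the lowercase artist keys and the use of `search()` (rather than `match()` or `fullmatch()`) are assumptions.

```python
import re


def compile_entry(entry: str) -> re.Pattern:
    """Compile a blacklist entry as a case-insensitive regex.

    If the entry is not a valid regex, fall back to matching it literally.
    """
    try:
        return re.compile(entry, re.IGNORECASE)
    except re.error:
        return re.compile(re.escape(entry), re.IGNORECASE)


def is_blacklisted(genre, artist, blacklist):
    """Return True if genre matches a global ("*") or artist-specific pattern."""
    # Assumption: artist keys are stored in lowercase.
    patterns = blacklist.get("*", []) + blacklist.get(artist.lower(), [])
    # Assumption: search() is used, so unanchored entries also match substrings.
    return any(p.search(genre) for p in patterns)


def drop_blacklisted(genres, artist, blacklist):
    """Return the genres that are not blacklisted for this artist."""
    return [g for g in genres if not is_blacklisted(g, artist, blacklist)]


# Hypothetical in-memory form of the example blacklist from the description.
BLACKLIST = {
    "fracture": [compile_entry(r"^(heavy|black|power|death)?\s?(metal|rock)$")],
    "*": [compile_entry("electronic")],
}
```

In this sketch the same `drop_blacklisted` call could serve both stages mentioned above: once on freshly fetched Last.fm tags and once on genres read from existing file tags.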

Why invent a file format?

YAML and INI formats were tested but proved cumbersome for regex patterns. A simple custom format is user-friendly and allows all regex patterns without special escaping or formatting.
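
To illustrate why a line-based format keeps regex patterns free of extra escaping, here is a rough parser sketch for the example shown in the description. It is not the parser from this PR: the function name `parse_blacklist`, the rule that unindented lines ending in `:` are artist headers, the lowercase keys, and the error messages are all assumptions made for illustration.

```python
def parse_blacklist(text: str) -> dict[str, list[str]]:
    """Parse 'artist:' headers followed by indented pattern lines."""
    blacklist: dict[str, list[str]] = {}
    current = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # skip blank lines
        if not raw[0].isspace():
            # Unindented line: must be an artist header such as "fracture:" or "*:".
            header = raw.rstrip()
            if not header.endswith(":"):
                raise ValueError(f"line {lineno}: expected 'artist:' header")
            current = header[:-1].strip().lower()
            blacklist.setdefault(current, [])
        else:
            # Indented line: a pattern, taken verbatim (no quoting or escaping).
            if current is None:
                raise ValueError(f"line {lineno}: pattern before any artist header")
            blacklist[current].append(raw.strip())
    return blacklist


SAMPLE = r"""fracture:
    ^(heavy|black|power|death)?\s?(metal|rock)$
*:
    electronic
"""
```

Calling `parse_blacklist(SAMPLE)` yields one list of raw pattern strings per artist key, with the regex characters untouched.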

Further notes

My personal roadmap for the lastgenre plugin: #5915

Feature idea originated from: #5721 (comment).

To Do

  • Documentation.
  • Changelog.
  • Tests.

@github-actions
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

JOJ0 changed the title from "Lastgenre forbidden" to "Lastgenre forbidden for artist list" on Apr 20, 2025
JOJ0 force-pushed the lastgenre_forbidden branch from 256a8a1 to c8199cb on August 4, 2025 08:48
codecov bot commented Aug 4, 2025

Codecov Report

❌ Patch coverage is 72.64151% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.59%. Comparing base (3e1a22a) to head (4bc1341).
✅ All tests successful. No failed tests found.

Files with missing lines            Patch %   Lines
beetsplug/lastgenre/client.py       15.00%    17 Missing ⚠️
beetsplug/lastgenre/__init__.py     86.04%    6 Missing and 6 partials ⚠️
Additional details and impacted files
@@                     Coverage Diff                      @@
##           lastgenre_split_monolith    #5744      +/-   ##
============================================================
+ Coverage                     69.57%   69.59%   +0.01%     
============================================================
  Files                           142      142              
  Lines                         18508    18598      +90     
  Branches                       3026     3054      +28     
============================================================
+ Hits                          12877    12943      +66     
- Misses                         4995     5014      +19     
- Partials                        636      641       +5     
Files with missing lines            Coverage Δ
beetsplug/lastgenre/__init__.py     78.73% <86.04%> (+3.08%) ⬆️
beetsplug/lastgenre/client.py       41.79% <15.00%> (-12.38%) ⬇️

... and 1 file with indirect coverage changes


JOJ0 force-pushed the lastgenre_forbidden branch 8 times, most recently from 9cd4947 to a0e6054, on August 4, 2025 15:41
JOJ0 changed the title from "Lastgenre forbidden for artist list" to "Lastgenre global and artist-based genre _blacklist_" on Aug 5, 2025
JOJ0 changed the title from "Lastgenre global and artist-based genre _blacklist_" to "lastgenre: Global and artist-based genre blacklist" on Aug 5, 2025
JOJ0 force-pushed the lastgenre_forbidden branch from a3daa01 to 4868326 on August 5, 2025 16:10
JOJ0 marked this pull request as ready for review on August 5, 2025 22:28
Copilot AI review requested due to automatic review settings on August 5, 2025 22:28
JOJ0 force-pushed the lastgenre_forbidden branch from 5b24d49 to 13a01b1 on August 6, 2025 05:07
JOJ0 marked this pull request as draft on August 18, 2025 09:20
JOJ0 (Member, Author) commented Aug 18, 2025

Setting this back to draft. I'd like to fix this bug first: #5930

JOJ0 force-pushed the lastgenre_forbidden branch 3 times, most recently from 58c1299 to 9e5f168, on August 28, 2025 05:39
JOJ0 marked this pull request as ready for review on August 28, 2025 06:34
JOJ0 force-pushed the lastgenre_split_monolith branch from 9ff6150 to 0f131b5 on February 15, 2026 17:42
JOJ0 force-pushed the lastgenre_forbidden branch 2 times, most recently from 723345e to 81c3219, on February 15, 2026 18:00
JOJ0 force-pushed the lastgenre_split_monolith branch 2 times, most recently from b1e9732 to 22de6ee, on February 19, 2026 07:44
JOJ0 force-pushed the lastgenre_split_monolith branch 5 times, most recently from 998f9a6 to 751d5d1, on February 28, 2026 06:37
JOJ0 force-pushed the lastgenre_forbidden branch from 81c3219 to f07cb30 on March 1, 2026 09:28
JOJ0 requested a review from Copilot on March 1, 2026 09:39
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

JOJ0 force-pushed the lastgenre_split_monolith branch 6 times, most recently from 42b0d45 to e49d7a3, on March 7, 2026 07:12
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

beetsplug/lastgenre/__init__.py:474

  • The log label suffix is only ever "whitelist" or "any". Now that the blacklist can also filter genres, "any" is misleading when a blacklist is active. The suffix should reflect the blacklist (e.g., "any+blacklist" or similar) so that debug logs stay accurate.
                suffix = "whitelist" if self.whitelist else "any"
                label = f"{stage_label}, {suffix}"

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (2)

beetsplug/lastgenre/__init__.py:483

  • The log label chooses between "whitelist" and "any" based only on self.whitelist, but the blacklist can now filter too. When the whitelist is off and the blacklist is on, the label says "any" even though genres are being filtered. Please include the blacklist state in the label (e.g., "blacklist" or "whitelist+blacklist") so that the logs are accurate.
                suffix = "whitelist" if self.whitelist else "any"
                label = f"{stage_label}, {suffix}"
                if keep_genres:
                    label = f"keep + {label}"
                return self._format_genres(resolved_genres), label
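
One possible shape for the suggested label, sketched as a standalone helper. The function name `filter_label` and its boolean parameters are hypothetical and do not appear in this PR; the suffix strings follow the examples given in the review comment.

```python
def filter_label(stage_label, whitelist_active, blacklist_active, keep_genres=False):
    """Build a debug label that names every filter that was applied."""
    if whitelist_active and blacklist_active:
        suffix = "whitelist+blacklist"
    elif whitelist_active:
        suffix = "whitelist"
    elif blacklist_active:
        suffix = "blacklist"
    else:
        suffix = "any"  # no filtering at all
    label = f"{stage_label}, {suffix}"
    if keep_genres:
        label = f"keep + {label}"
    return label
```

With this sketch, "any" appears only when neither list is active, so the label no longer hides blacklist filtering.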

beetsplug/lastgenre/__init__.py:322

  • The _resolve_genres docstring still describes only whitelist filtering. The blacklist now also filters genres and can skip the c14n parent walk. Please update the docstring bullets to match the new behavior (whitelist + blacklist).
        """Canonicalize, sort and filter a list of genres.

        - Returns an empty list if the input tags list is empty.
        - If canonicalization is enabled, it extends the list by incorporating
          parent genres from the canonicalization tree. When a whitelist is set,
          only parent tags that pass the whitelist filter are included;
          otherwise, it adds the oldest ancestor. Adding parent tags is stopped
          when the count of tags reaches the configured limit (count).
        - The tags list is then deduplicated to ensure only unique genres are
          retained.
        - If the 'prefer_specific' configuration is enabled, the list is sorted
          by the specificity (depth in the canonicalization tree) of the genres.
        - Finally applies whitelist filtering to ensure that only valid
          genres are kept. (This may result in no genres at all being retained).
        - Returns the filtered list of genres, limited to the configured count.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

beetsplug/lastgenre/__init__.py:486

  • The log label uses the suffix "any" whenever the whitelist is off. Because the blacklist can now be active, that is not really "any": the log is misleading and debugging gets harder. Consider including the blacklist state in the suffix (e.g., "blacklist" or "any+blacklist").
            if resolved_genres:
                suffix = "whitelist" if self.whitelist else "any"
                label = f"{stage_label}, {suffix}"
                if keep_genres:

Comment on lines +275 to +278
Tries regex compilation first, falls back to literal string matching.
That way users can use regexes for flexible matching but also simple
strings without worrying about regex syntax. All patterns are
case-insensitive.
if filtered_genre != genre:
    log_filtered = set(genre) - set(filtered_genre)
    extra_debug(self._log, "blacklisted: {}", log_filtered)
    genre = filtered_genre
JOJ0 added 3 commits March 15, 2026 22:59
- Test file format (valid and error cases)
- Test regex pattern matching (_is_blacklisted)
- Test _resolve_genres: blacklisted genres filtered
- Test _resolve_genres: c14n ancestry walk blocked for blacklisted tags
- Prevents wrong last.fm genres based on a per artist list of forbidden
  regex patterns. Blacklisting happens in two places: Right after
  fetching the last.fm genre and in _resolve_genres.
- Includes docs for the new feature.

Labels

lastgenre plugin Pull requests that are plugins related

2 participants